% etex_gen.tex --- How to generate e-TeX --- last modified 06 Mar 1998

\font\eighttt= cmtt8
\font\eightrm= cmr8
\font\rtitlefont= cmr7 scaled\magstep5
\font\ititlefont= cmmi7 scaled\magstep5
\def\titlefont{\rtitlefont \textfont1=\ititlefont}
\def\eTeX{$\varepsilon$-\TeX}
\def\NTS{NTS}
\let\mc=\eightrm
\rm
\let\mainfont=\tenrm

\def\.#1{\hbox{\tt#1}}
\def\\#1{\hbox{\it#1\/\hskip.05em}} % italic type for identifiers

\parskip 2pt plus 1pt
\baselineskip 12pt plus .25pt

\output{\shipout\box255\global\advance\pageno by 1} % for the title page only
\null
\vfill
\centerline{\titlefont How to generate \eTeX}
\vskip 6pt
\centerline{({\sl Version 2, March 1998\/})}
\vskip 18pt
\centerline{by The \NTS\ Team}
\vskip 6pt
\centerline{Peter Breitenlohner, Max-Planck-Institut f\"ur Physik, M\"unchen}
\vskip 6pt
\centerline{Philip Taylor, RHBNC, University of London}
\vfill
\centerline{\vbox{\hsize 4in
\noindent Given an implementation of \TeX82 for a particular system, this
report describes how to generate a corresponding implementation of
\eTeX.}}
\vskip 24pt
{\baselineskip 9pt
\eightrm\noindent
The preparation of this report was supported in part by DANTE,
Deutschsprachige Anwendervereinigung \TeX\ e.V.\hfil\break
`\TeX' is a trademark of the American Mathematical Society.

}\pageno=0\eject

\output{\shipout\vbox{ % for subsequent pages
    \baselineskip0pt\lineskip0pt
    \hbox to\hsize{\strut
      \ifodd\pageno \hfil\eightrm\firstmark\hfil
        \mainfont\the\pageno
      \else\mainfont\the\pageno\hfil
        \eightrm\firstmark\hfil\fi}
    \vskip 10pt
    \box255}
  \global\advance\pageno by 1}
\let\runninghead=\mark
\outer\def\section#1.{\noindent{\bf#1.}\quad
  \runninghead{\uppercase{#1} }\ignorespaces}

\section Introduction.
Let us first review the process of generating an implementation of
\TeX82 for a particular system from the source files as, e.g., described
in the \.{WEB} manual [1].  The system-independent source file
\.{tex.web} must remain unmodified.  All changes to \.{tex.web}
necessary for a particular operating system and\slash or compiler are to
be collected in a system-dependent change file, typically named
\.{tex.ch}.  Both files \.{tex.web} and \.{tex.ch} are effectively
merged when input by the \.{WEB} system programs \.{WEAVE} and
\.{TANGLE}.  When \.{WEAVE} processes this merged input, a file
\.{tex.tex} is produced.  Further processing by \TeX\ yields a `pretty
printed' version of the input together with an index.

When \.{TANGLE} processes the merged input, a string pool file
\.{tex.pool} and a Pascal file \.{tex.pas} (or similar) are produced.
The Pascal file can then be further processed by a suitable compiler
and\slash or language converter such as \.{web2c}, and eventually yields
an executable program.

There are actually three versions of the program:  First there is
\.{INITEX} with its capability to initialize all of \TeX's tables and to
write them in compact form to a format file.  Next there is the
production version \.{VIRTEX} requiring a format file as input.  Finally
there is \.{TRIPTEX}, a version of \.{INITEX} with special values for
some of \TeX's parameters, for the \.{TRIP} test [2] that should be used
to validate a \TeX\ implementation.  Depending on the capabilities of
the compiler, these three versions of the program are generated from
three slightly different change files or they are generated from one
change file with different compiler options.  They might even be one and
the same executable program used with different run time options.

\vskip 24pt plus 24pt
\section Generating \eTeX.
The process of generating \eTeX\ is essentially the same as that of
generating \TeX\ as described above.  Conceptually there is a
system-independent source file \.{etex.web} and a system-dependent change
file \.{etex.sys}.  Processing these two files by \.{TANGLE} yields a
string pool file \.{etex.pool} as well as a Pascal file \.{etex.pas},
whilst processing by \.{WEAVE} produces a \TeX\ source file, \.{etex.tex}.

It may, however, be necessary to increase some of the constants defined
in \.{TANGLE} and \.{WEAVE}.  The following values should suffice in
most cases:
$$
\vcenter{\halign{$#$\hfil\qquad&#\hfil\cr
\\{max\_bytes}\times\\{ww}=100~000&\.{TANGLE} and \.{WEAVE}\cr
\\{max\_texts}=2~500&\.{TANGLE} and \.{WEAVE}\cr
\\{max\_toks}\times\\{zz}=180~000&\.{TANGLE}\cr
\\{max\_names}=5~000&\.{TANGLE}\cr
\\{max\_scraps}=3~000&\.{WEAVE}\cr
\\{stack\_size}=300&\.{WEAVE}\cr
}}
$$

The source file \.{etex.web} for \eTeX\ does not, however, exist as a
physical file.  It is the hypothetical file obtained by applying the
changes in the actual source file \.{etex.ch} to \.{tex.web}.  Thus
\.{etex.web} inherits the bulk of code from \.{tex.web}, whilst the
system-independent source file \.{etex.ch} for \eTeX\ defines the
differences between \.{etex.web} and \.{tex.web}.  In order to generate
an implementation of \eTeX\ two change files have to be applied to
\.{tex.web}, one after the other (the actual file names may differ):
$$
\vcenter{\halign{#\hfil&\qquad\.{#}\hfil&\qquad#\hfil\cr
0.&tex.web&system-independent \.{WEB} source for \TeX\cr
1.&etex.ch&system-independent changes for \eTeX\cr
2.&etex.sys&system-dependent changes for \eTeX\cr
}}
$$

The process of merging several change files into \.{tex.web} should
certainly not be performed by hand.  There are programs such as \.{TIE}
and \.{PATCHWEB} that perform this process automatically.  If no such
program is available, a \TeX\ program \.{WEBMERGE} can be used.
\.{WEBMERGE} reads \.{tex.web} and up to nine change files and produces
a merged change file that can then be processed, together with
\.{tex.web}, by \.{TANGLE} and \.{WEAVE}.  [On systems such as VMS, use
of \.{WEBMERGE} can leave a large number of temporary files
lying around; this can be avoided by setting a version limit (e.g.~1) on any
existing versions of those files, or by setting a version limit on the
directory in which they will be created.  On other systems, it will probably
leave one large temporary file.]

Every implementor of \eTeX\ is responsible for creating and maintaining
a suitable \.{etex.sys} in the same way as every implementor of \TeX\
is responsible for creating and maintaining \.{tex.ch}.  Since the bulk
of code in \.{etex.web} is identical to that in \.{tex.web} the bulk of
the system-dependent changes in \.{etex.sys} for a particular system
will be identical to those in \.{tex.ch} for the same system.  In the
following we try to give some hints where \.{etex.sys} for a particular
system might deviate from the corresponding \.{tex.ch}.

First, it might be necessary to increase the size of \TeX's string pool
in order to accommodate \eTeX's additional strings (message texts as
well as multi-letter control sequences).  If this turns out to be
necessary for \eTeX\ it would certainly not be harmful to do it for
\TeX\ as well.  \TeX\ and \eTeX\ use three constants related to the
string pool:  \\{max\_strings} the maximal number of strings,
\\{pool\_size} the maximal number of string characters, and
\\{string\_vacancies} the minimal number of available string characters
in addition to those occupied by strings from the pool file.  It is,
therefore, sufficient to increase \\{pool\_size} (or reduce
\\{string\_vacancies}) by the number of \eTeX's additional string
characters and to increase \\{max\_strings} by the number of \eTeX's
additional strings.  The latter will, however, be unnecessary for most
implementations as \\{max\_strings} is usually increased substantially
beyond its standard value in order to accommodate large \TeX\ macro
packages.

For Version~2 of \eTeX, there are about 100 additional strings with
about 1500 additional string characters.  The precise numbers can be
obtained by running \.{POOLTYPE} on \TeX's and \eTeX's pool files
(\.{POOLTYPE} reports the total number of strings and string characters
in a pool file).

Next, \.{etex.sys} may contain a system-dependent modification of the
\\{eTeX\_banner} string.  The modified \\{eTeX\_banner} string must
contain `\.{e-TeX}' as well as the \eTeX\ version number.  Note,
however, that the \\{banner} string modified by \.{tex.ch} will not be
referenced by \eTeX\ unless the implementor intentionally changes that
aspect of \eTeX's functionality:  therefore \.{etex.sys} can modify the
\\{banner} string in the same way as does \.{tex.ch}.

Then, \.{etex.sys} might deviate from \.{tex.ch} in order to use a
different pool file name and\slash or format file extension (see below).

Finally, \.{etex.sys} will necessarily deviate whenever \.{etex.ch}
and \.{tex.ch} try to change the same piece of \.{WEB} code or when the
system-independent \eTeX\ changes from \.{etex.ch} and the
system-dependent \TeX\ changes from \.{tex.ch} interfere in some other way. 
In case of any such interference implementors must first of all determine
how to combine the respective changes from \.{etex.ch} and \.{tex.ch}
in order to obtain \eTeX's functionality for a particular system.
Obviously, this process cannot be automated since it requires human
insight.

The \NTS\ team has tried to formulate \.{etex.ch} such that
interferences with system-dependent change files \.{tex.ch} are
unlikely.  Suggestions by implementors how any remaining such
interferences could be avoided by a reformulation of \.{etex.ch} will
be taken into serious consideration.  Such interferences can be further
reduced by reformulating the system-dependent change file \.{tex.ch} for
\TeX, e.g.\ by reducing the range of change entries from entire modules
to the pieces of code that are actually changed.

Implementors might prefer to maintain the system-dependent change file
\.{etex.sys} not as a physical file but as a hypothetical file defined
through its deviation from \.{tex.ch}.  If there are no interferences of
the kind mentioned above, then the effect of applying the changes from
the hypothetical \.{etex.sys} to the hypothetical \.{etex.web} can be
achieved by applying 3 change files, one after the other, to \.{tex.web}
(using some tool such as \.{TIE}, \.{PATCHWEB}, or \.{WEBMERGE}):
$$
\vcenter{\halign{#\hfil&\qquad\.{#}\hfil&\qquad#\hfil\cr
0.&tex.web&system-independent \.{WEB} source for \TeX\cr
1.&etex.ch&system-independent changes for \eTeX\cr
2.&tex.ch&system-dependent changes for \TeX\cr
3.&tex.ech&additional system-dependent changes for \eTeX\cr
}}
$$
The third change file \.{tex.ech} will be rather short and contains just
the differences between \.{etex.sys} and \.{tex.ch}.  It is recommended
that implementors try to remove all interferences between \.{etex.ch}
and \.{tex.ch} and use this method to generate \eTeX.

As with \TeX\ there are three versions of \eTeX:  \.{e-INITEX},
\.{e-VIRTEX}, and \.{e-TRIPTEX}.  Depending on the implementation they
will again be generated from the three slightly different versions of
\.{tex.ch} or with different compiler options or they may be one and the
same program used with different run time options.

\vskip 24pt plus 24pt
\section \eTeX\ modes.
In order to ensure maximal compatibility with \TeX, \eTeX\ can run in
either compatibility mode or extended mode.  The possibility of this
choice is, of course, an extended feature of \eTeX.  Once \eTeX\ has
chosen compatibility mode it is, however, a legitimate implementation of
\TeX\ (assuming the \TeX\ implementation itself is legitimate).  The
only differences between \eTeX\ in compatibility mode and \TeX\ are
those allowed by D.~E.~Knuth [2] between different implementations of
\TeX.

An \.{e-TRIP} test suite [3] defines the criteria by which a program can
qualify to use the name `\eTeX'.  Part of the \.{e-TRIP} test consists
of the standard \.{TRIP} test for \.{e-TRIPTEX} in compatibility and
extended mode.

\eTeX\ can therefore be used instead of \TeX\ without the necessity to
maintain both programs.  For the case that both programs should
nevertheless co-exist on a system, it might be a good idea to name the
pool file for \eTeX\ \.{etex.pool} instead of \.{tex.pool} and to use
an extension other than \.{.fmt}, e.g., \.{.efmt} for \eTeX\ format
files.  (Format files for \TeX\ and \eTeX\ are incompatible).  All this
will require additional changes in the file \.{tex.ech}.

When \.{INITEX} or \.{VIRTEX} start, they inspect the first non-blank
character from the command line or in response to the \.{**} prompt.
This may be an \.{\&} immediately followed by the name of a format file
to be loaded.  Otherwise \.{VIRTEX} loads a default format, whereas
\.{INITEX} starts without loading a format.

When \.{e-INITEX} or \.{e-VIRTEX} start, they inspect the first
non-blank character from the command line or in response to the \.{**}
prompt.  This may again be an \.{\&} immediately followed by the name of
a format file to be loaded; otherwise \.{e-VIRTEX} loads a default
format.  For \.{e-INITEX} the first non-blank character may be an \.{*}
immediately followed by what would normally be the input for \.{INITEX}
(without intervening blanks).  \.{e-INITEX} enters extended mode in
response to the \.{*}, or compatibility mode otherwise.  This mode is
recorded in format files produced by \.{e-INITEX} and entered again when
such a format is loaded (by either \.{e-INITEX} or \.{e-VIRTEX}).

\vskip 24pt plus 24pt
\section References.
\item {[1]}
{\sl The \.{WEB} system of structured documentation\/},
by Donald E.~Knuth,\hfil\break Stanford Computer Science Report~980.

\item {[2]}
{\sl A torture test for \TeX\/},
by Donald E.~Knuth, Stanford Computer Science Report~1027.

\item {[3]}
{\sl A torture test for \eTeX\/},
by The \NTS\ Team (Peter Breitenlohner and Bernd Raichle).

\end
